AITopics | Province of Cebu

Collaborating Authors

Province of Cebu

Auslan-Daily: Australian Sign Language Translation for Daily Communication and News

Neural Information Processing SystemsFeb-18-2026, 04:02:11 GMT

Considering different geographic regions generally have their own native sign languages, it is valuable to establish corresponding SL T datasets to support related communication and research. Auslan, as a sign language specific to Australia, still lacks a dedicated large-scale dataset for SL T.

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(22 more...)

Industry:

Education > Curriculum > Subject-Specific Education (0.96)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

feb34ce77fc8b94c85d12e608b23ce67-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsOct-9-2025, 12:52:44 GMT

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(22 more...)

Industry:

Health & Medicine (0.69)
Media (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Semantic Decomposition and Selective Context Filtering -- Text Processing Techniques for Context-Aware NLP-Based Systems

Villardar, Karl John

arXiv.org Artificial IntelligenceFeb-19-2025

In this paper, we present two techniques for use in context-aware systems: Semantic Decomposition, which sequentially decomposes input prompts into a structured and hierarchal information schema in which systems can parse and process easily, and Selective Context Filtering, which enables systems to systematically filter out specific irrelevant sections of contextual information that is fed through a system's NLP-based pipeline. We will explore how context-aware systems and applications can utilize these two techniques in order to implement dynamic LLM-to-system interfaces, improve an LLM's ability to generate more contextually cohesive user-facing responses, and optimize complex automated workflows and pipelines.

arxiv, information, language model, (14 more...)

arXiv.org Artificial Intelligence

2502.14048

Country:

Asia > South Korea (0.04)
Asia > Philippines > Visayas > Central Visayas > Province of Cebu > City of Cebu (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine (1.00)
Banking & Finance (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Enhancing coastal water body segmentation with Landsat Irish Coastal Segmentation (LICS) dataset

O'Sullivan, Conor, Kashyap, Ambrish, Coveney, Seamus, Monteys, Xavier, Dev, Soumyabrata

arXiv.org Artificial IntelligenceSep-5-2024

Ireland's coastline, a critical and dynamic resource, is facing challenges such as erosion, sedimentation, and human activities. Monitoring these changes is a complex task we approach using a combination of satellite imagery and deep learning methods. However, limited research exists in this area, particularly for Ireland. This paper presents the Landsat Irish Coastal Segmentation (LICS) dataset, which aims to facilitate the development of deep learning methods for coastal water body segmentation while addressing modelling challenges specific to Irish meteorology and coastal types. The dataset is used to evaluate various automated approaches for segmentation, with U-NET achieving the highest accuracy of 95.0% among deep learning methods. Nevertheless, the Normalised Difference Water Index (NDWI) benchmark outperformed U-NET with an average accuracy of 97.2%. The study suggests that deep learning approaches can be further improved with more accurate training data and by considering alternative measurements of erosion. The LICS dataset and code are freely available to support reproducible research and further advancements in coastal monitoring efforts.

coastline, pixel, segmentation, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.rsase.2024.101276

2409.15311

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Pacific Ocean > North Pacific Ocean > East China Sea > Yellow Sea (0.04)
Oceania > Australia (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.48)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Machine Learning Framework for High-Resolution Air Temperature Downscaling Using LiDAR-Derived Urban Morphological Features

Chajaei, Fatemeh, Bagheri, Hossein

arXiv.org Artificial IntelligenceAug-31-2024

Climate models lack the necessary resolution for urban climate studies, requiring computationally intensive processes to estimate high resolution air temperatures. In contrast, Data-driven approaches offer faster and more accurate air temperature downscaling. This study presents a data-driven framework for downscaling air temperature using publicly available outputs from urban climate models, specifically datasets generated by UrbClim. The proposed framework utilized morphological features extracted from LiDAR data. To extract urban morphological features, first a three-dimensional building model was created using LiDAR data and deep learning models. Then, these features were integrated with meteorological parameters such as wind, humidity, etc., to downscale air temperature using machine learning algorithms. The results demonstrated that the developed framework effectively extracted urban morphological features from LiDAR data. Deep learning algorithms played a crucial role in generating three-dimensional models for extracting the aforementioned features. Also, the evaluation of air temperature downscaling results using various machine learning models indicated that the LightGBM model had the best performance with an RMSE of 0.352{\deg}K and MAE of 0.215{\deg}K. Furthermore, the examination of final air temperature maps derived from downscaling showed that the developed framework successfully estimated air temperatures at higher resolutions, enabling the identification of local air temperature patterns at street level. The corresponding source codes are available on GitHub: https://github.com/FatemehCh97/Air-Temperature-Downscaling.

air temperature, building footprint, resolution, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.uclim.2024.102102

2409.0212

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.07)
Asia > Singapore (0.04)
(23 more...)

Genre: Research Report > New Finding (0.69)

Industry: Energy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Effectiveness of Edge Detection Evaluation Metrics for Automated Coastline Detection

O'Sullivan, Conor, Coveney, Seamus, Monteys, Xavier, Dev, Soumyabrata

arXiv.org Artificial IntelligenceMay-19-2024

We analyse the effectiveness of RMSE, PSNR, SSIM and FOM for evaluating edge detection algorithms used for automated coastline detection. Typically, the accuracy of detected coastlines is assessed visually. This can be impractical on a large scale leading to the need for objective evaluation metrics. Hence, we conduct an experiment to find reliable metrics. We apply Canny edge detection to 95 coastline satellite images across 49 testing locations. We vary the Hysteresis thresholds and compare metric values to a visual analysis of detected edges. We found that FOM was the most reliable metric for selecting the best threshold. It could select a better threshold 92.6% of the time and the best threshold 66.3% of the time. This is compared RMSE, PSNR and SSIM which could select the best threshold 6.3%, 6.3% and 11.6% of the time respectively. We provide a reason for these results by reformulating RMSE, PSNR and SSIM in terms of confusion matrix measures. This suggests these metrics not only fail for this experiment but are not useful for evaluating edge detection in general.

edge map, metric, threshold, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/PIERS59004.2023.10221292

2405.11498

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Asia > Philippines > Visayas > Central Visayas > Province of Cebu > City of Lapu-Lapu (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Interpreting a Semantic Segmentation Model for Coastline Detection

O'Sullivan, Conor, Coveney, Seamus, Monteys, Xavier, Dev, Soumyabrata

arXiv.org Artificial IntelligenceMay-19-2024

We interpret a deep-learning semantic segmentation model used to classify coastline satellite images into land and water. This is to build trust in the model and gain new insight into the process of coastal water body extraction. Specifically, we seek to understand which spectral bands are important for predicting segmentation masks. This is done using a permutation importance approach. Results show that the NIR is the most important spectral band. Permuting this band lead to a decrease in accuracy of 38.12 percentage points. This is followed by Water Vapour, SWIR 1, and Blue bands with 2.58, 0.78 and 0.19 respectively. Water Vapour is not typically used in water indices and these results suggest it may be useful for water body extraction. Permuting, the Coastal Aerosol, Green, Red, RE1, RE2, RE3, RE4, and SWIR 2 bands did not decrease accuracy. This suggests they could be excluded from future model builds reducing complexity and computational requirements.

accuracy, prediction, spectral band, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/PIERS59004.2023.10221387

2405.115

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Oceania > Australia (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Philippines > Visayas > Central Visayas > Province of Cebu > City of Lapu-Lapu (0.04)

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

Add feedback

Colexifications for Bootstrapping Cross-lingual Datasets: The Case of Phonology, Concreteness, and Affectiveness

Chen, Yiyi, Bjerva, Johannes

arXiv.org Artificial IntelligenceJun-5-2023

Colexification refers to the linguistic phenomenon where a single lexical form is used to convey multiple meanings. By studying cross-lingual colexifications, researchers have gained valuable insights into fields such as psycholinguistics and cognitive sciences [Jackson et al.,2019]. While several multilingual colexification datasets exist, there is untapped potential in using this information to bootstrap datasets across such semantic features. In this paper, we aim to demonstrate how colexifications can be leveraged to create such cross-lingual datasets. We showcase curation procedures which result in a dataset covering 142 languages across 21 language families across the world. The dataset includes ratings of concreteness and affectiveness, mapped with phonemes and phonological features. We further analyze the dataset along different dimensions to demonstrate potential of the proposed procedures in facilitating further interdisciplinary research in psychology, cognitive science, and multilingual natural language processing (NLP). Based on initial investigations, we observe that i) colexifications that are closer in concreteness/affectiveness are more likely to colexify; ii) certain initial/last phonemes are significantly correlated with concreteness/affectiveness intra language families, such as /k/ as the initial phoneme in both Turkic and Tai-Kadai correlated with concreteness, and /p/ in Dravidian and Sino-Tibetan correlated with Valence; iii) the type-to-token ratio (TTR) of phonemes are positively correlated with concreteness across several language families, while the length of phoneme segments are negatively correlated with concreteness; iv) certain phonological features are negatively correlated with concreteness across languages. The dataset is made public online for further research.

computational linguistic, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2306.02646

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
(19 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Industry: Banking & Finance > Credit (0.34)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

Add feedback

Utilization of Multinomial Naive Bayes Algorithm and Term Frequency Inverse Document Frequency (TF-IDF Vectorizer) in Checking the Credibility of News Tweet in the Philippines

Riego, Neil Christian R., Villarba, Danny Bell

arXiv.org Artificial IntelligenceMay-30-2023

The digitalization of news media become a good indicator of progress and signal to more threats. Media disinformation or fake news is one of these threats, and it is necessary to take any action in fighting disinformation. This paper utilizes ground truth-based annotations and TF-IDF as feature extraction for the news articles which is then used as a training data set for Multinomial Naive Bayes. The model has an accuracy of 99.46% in training and 88.98% in predicting unseen data. Tagging fake news as real news is a concerning point on the prediction that is indicated in the F1 score of 89.68%. This could lead to a negative impact. To prevent this to happen it is suggested to further improve the corpus collection, and use an ensemble machine learning to reinforce the prediction

artificial intelligence, dataset, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2306.00018

Country:

Asia > Philippines > Luzon > National Capital Region > City of Manila (0.05)
South America > Colombia > Meta Department > Villavicencio (0.05)
Asia > Philippines > Visayas > Central Visayas > Province of Cebu > City of Lapu-Lapu (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (0.82)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

FACTIFY-5WQA: 5W Aspect-based Fact Verification through Question Answering

Rani, Anku, Tonmoy, S. M Towhidul Islam, Dalal, Dwip, Gautam, Shreya, Chakraborty, Megha, Chadha, Aman, Sheth, Amit, Das, Amitava

arXiv.org Artificial IntelligenceMay-28-2023

Automatic fact verification has received significant attention recently. Contemporary automatic fact-checking systems focus on estimating truthfulness using numerical scores which are not human-interpretable. A human fact-checker generally follows several logical steps to verify a verisimilitude claim and conclude whether its truthful or a mere masquerade. Popular fact-checking websites follow a common structure for fact categorization such as half true, half false, false, pants on fire, etc. Therefore, it is necessary to have an aspect-based (delineating which part(s) are true and which are false) explainable system that can assist human fact-checkers in asking relevant questions related to a fact, which can then be validated separately to reach a final verdict. In this paper, we propose a 5W framework (who, what, when, where, and why) for question-answer-based fact explainability. To that end, we present a semi-automatically generated dataset called FACTIFY-5WQA, which consists of 391, 041 facts along with relevant 5W QAs - underscoring our major contribution to this paper. A semantic role labeling system has been utilized to locate 5Ws, which generates QA pairs for claims using a masked language model. Finally, we report a baseline QA system to automatically locate those answers from evidence documents, which can serve as a baseline for future research in the field. Lastly, we propose a robust fact verification system that takes paraphrased claims and automatically validates them. The dataset and the baseline model are available at https: //github.com/ankuranii/acl-5W-QA

covid-19 vaccine, natural language, question answering, (18 more...)

arXiv.org Artificial Intelligence

2305.04329

Country:

North America > United States > Virginia (0.05)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.05)
Asia > China > Hubei Province > Wuhan (0.05)
(15 more...)

Genre: Research Report (0.82)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Football (1.00)
Law (1.00)
(6 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback